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SMALLER AND LOWER POWER STATIC MUX CIRCUITRY IN 
GENERATING MULTIPLIER PARTIAL PRODUCT SIGNALS 



BACKGROUND 

5 Computer multiplication circuits accept a multiplicand and a multiplier and 

generate a product. One straightforward method to multiply binary numbers is the 
long form of multiplication. This is the standard shift-and-add approach. That is, 
for each column in a multiplier, shift the multiplicand the appropriate number of 
columns and add the shifted multiplicand into the final total if the multiplier column 

10 contains a one, or don't add it if the if the multiplier colunm contains a zero. The 
shifted numbers to be added are called partial products because they represent 
intermediate results in determining the final product of the multiply. All of the 
partial products are added together to determine the final product. Thus, the number 
of shifts to be executed is equal to the number of columns in the multiplier, and the 

15 number of partial products to be added is equal to the number of ones in the 
muhiplier. 

This method of multiplying is slow and there have been methods developed 
to speed up the multiplying process. One method to speed up multiplying is to use 
radix-four multiplication or Booth multiplication. Instead of shifting and adding for 

20 every column of the multiplier, the method uses every other column. Booth 
encoding involves looking at three consecutive bits of a multiplier to determine 
whether to multiply the multiplicand by -1, +1, -2, +2, or zero to obtain a partial 
product. This method reduces the number of partial products to be added by one- 
half, and consequently reduces the complexity and power consumption of circuits 

25 that implement the method. 

The Booth encoding process consists of looking at three bits of a multiplier 
to determine how to calculate a partial product. As an example the hexadecimal 
number 4E2 is shown below. 

30 01001 1100010_ 
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In long multiplication by the shift and add method, twelve partial products would be 
used to determine the final product. To determine the partial products for Booth 
encoding, the number is grouped into the three-bit blocks. The least significant 

5 block begins with only the two least significant bits of the multiplier and zero is 
used as the least significant bit (LSB) of the block. Grouping starts at the LSB and 
each block overlaps the previous block by one bit. The most significant block is 
sign extended if necessary to fill out a block. Based on the three bits in the block, 
the multiplicand is multiplied by -1, +1, -2, +2, or 0 to obtain the partial product. 

10 Table 1 shows the encoding used for each possible three-bit block. 



Table 1 



Bit Block 


Partial product 


000 


0 * Multiplicand 


001 


+1 * Multiplicand 


010 


+1 * Multiplicand 


Oil 


+2 * Multiplicand 


100 


-2 * Multiplicand 


101 


-1 * Multiplicand 


110 


-1 * Multiplicand 


111 


0 * Multiplicand 



Starting with the LSB in the example above, the six blocks are 100, 001, 100, 1 1 1, 
001 and 010. The multiplicand is then multipHed by -2, +1, -2, 0, +1 and +1 to 

15 obtain the six partial products. The partial products are shifted according to which 
block is decoded and then added together to obtain the final product. 

Despite reducing the number of partial products by one-half, Booth 
multiplication can still result in complex circuits. Typically, computers that want to 
obtain the result of a multiply as quickly as possible use a circuit to calculate each 

20 bit of the partial products. So, for example, a 64-bit by 64-bit multiply that uses the 
shift and add method needs to add 64 partial products of 64 bits each; or 4096 
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circuits. A 64-bit by 64-bit multiply that uses Booth encoding reduces the number 
of partial products to be added by one-half. However, this method still requires 
adding 32 partial products of 64 bits each; or 2048 circuits. 

It can be seen from this discussion that reducing the complexity of multiply 
5 circuits would result in significant savings of space used in fabricating the circuits 
and saving of power consumption in the operation of such circuits. 

BRIEF DESCRIPTION OF THE DRAWINGS 
In the drawings like numerals refer to like components throughout the 
10 several views. 

FIG. 1 is a block diagram of an embodiment of a circuit to generate one 
partial product for a Booth multiply. 

FIG. 2 is one embodiment of a multiplexer circuit for generating a partial 
product for multiplying using Booth encoding. 
15 FIG. 3 shows one embodiment of a Booth encoder circuit. 

FIG. 4 shows another embodiment of a Booth encoder circuit. 
FIG. 5 is an illustration of a partial product array circuit. 
FIG. 6 is a block diagram of one embodiment of a multiply circuit that uses a 
partial product array circuit. 
20 FIG. 7 is an illustration of one embodiment of a muUiplexer circuit with 

increased fanout capability. 

FIG. 8 is a block diagram of one embodiment of a multiply circuit that uses a 
partial product circuit with an accumulator. 

FIG. 9 shows one embodiment of a method of multiplying a multiplicand 
25 and a multiplier. 

FIG. 10 shows a computer system that includes a multiply circuit. 



DETAILED DESCRIPTION 
In the following detailed description, reference is made to the accompanying 
30 drawings which form a part hereof, and in which is shown by way of illustration 
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specific embodiments in which the invention may be practiced. It is to be 
understood that other embodiments may be used and structural changes may be 
made without departing from the scope of the present invention. 

This document describes circuits to generate partial products for 
5 multiplication. The partial products are generated for a type of multiplication 
known as radix-four multiplication, or Booth multiplication. 

To generate the partial products, multiplexing can be used. In multiplexing, 
signals corresponding to the Booth-multiply functions of multiply by 0, -1, +1, -2, 
and +2 are used to select a bit for output from the muhiplexer. FIG. 1 is a block 

10 diagram of an embodiment of a circuit 100 to generate one partial product for a 

Booth multiply. Inputs to the partial product circuit 100 are bits of the multiplicand 
105, their complements, bits of the multiplier 1 10 and their complements. The 
circuit 100 includes multiplexers 115. There is one multiplexer 1 15 for each bit of 
the multiplicand 105. The multiplier bits 1 10 are encoded using Booth encoder 

15 circuit 120 to generate Booth control signals 125. The Booth control signals 125 are 
distributed to multiplexers 1 15 to select outputs to generate bits of the partial 
product 130. The Booth control signals 125 include signals to generate a negative 
result (NEG), a positive result (POS), a muUiply by one (Ml) and a multiply by two 
(M2). 

20 FIG. 2 is one embodiment of a multiplexer circuit 215. The circuit includes 

four pass gates 210, 220, 230, 240. In the circuit 215, the pass gates 210, 220, 230, 
240 are shown implemented with field effect transistors (FETs). The FETs shown 
are N-type, but the multiplexer circuit 215 can also be implemented with P-type 
transistors or with bipolar transistors. In the multiplexer circuit 215, the gate of the 

25 first transistor 210 is coupled to receive a multiplicand bit Zj and the gate of the 
second transistor 220 is coupled to receive the complement to the multiplicand bit. 
The drains of the transistors 210, 220 are coupled together to provide an output 
Zj+i The source of the first transistor 210 is coupled to the POS control signal. 
The source of the second transistor 220 is coupled to the NEG control signal. 

30 Because only one of Zj* or its complement is active at any one time, the two 
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transistors implement the truth table below in Table 2. In the table, "1" denotes an 
active state such as high, "0" denotes an inactive state such as low, and "X" denotes 
a don't care state. Although the description above uses specific source and drain 
connections, one of ordinary skill in the art would understand the source and drain 
5 regions are interchangeable. 



Table 2 



Zj 


NEG 


POS 


Zj+i' 


0 


0 


X 


0 


0 


1 


X 


1 


1 


X 


0 


0 


1 


X 


1 


1 



The middle two rows of the table show that the multiplicand bit is inverted at 

10 the Zj+i ' output by an active NEG signal or an inactive POS signal. It is also shown 
that if the NEG and POS control signals are both inactive, the Zj+i' output is 
necessarily "0." Figure 2 also shows that the intermediate output Zj+i ' is provided 
to a next higher order multiplexer stage. The drains of the third and fourth 
transistors 230, 240 are also tied together. If the control signal Ml is active, the 

1 5 third transistor 230 passes the Zj+i ' drain output to the multiplexer output, PPj, to 
effect a multiply by one of the Zj+i ' drain output. If the M2 output is active, the 
fourth transistor 240 passes the Zj.i ' intermediate output from the next lower 
multiplexer stage to the output to effect a left shift for the multiply by two. Zj.i' 
corresponds to Zj+i' fi-om the next lower stage. The multiplexer output, PPj, 

20 corresponds to one bit of the resultant partial product. 

FIG. 3 shows one embodiment 320 of a Booth encoder circuit. The circuit 
320 uses logic circuits to encode a three-bit block of a multiplier, ya, y2, yi into four 
control signals NEG, POS, Ml and M2 corresponding to a negative result, a positive 
result, multiply by one and multiply by two. The Booth encoder circuit 320 is used 

25 to encode any block of three bits of the muUiplier block. Although specific 
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embodiments are discussed conceming the Booth encoder circuit 320, one skilled in 
the art of logic circuit design would understand from reading this description that 
similar logic circuits to produce the indicated control signals are within the scope of 
this document. 

5 The embodiment 320 shows four logic circuits. The first logic circuit 

generates the NEG signal. The first logic circuit includes a two-input NAND gate 
325 where the inputs to the NAND gate 325 are first and second bits ya, yi of the 
multiplier block. The first logic circuit also includes a transmission gate, or T-gate, 
330 coupled to the output of the NAND gate 325. A third multiplier bit ya and its 

10 complement activate the T-gate 330. A pull-down transistor 335 is coupled to the 
output of the T-gate 330. The complement of the third multiplier ys bit activates the 
pull down transistor 335. The output of the T-gate 330 provides the NEG signal. 

A second logic circuit generates the POS signal In the embodiment 320, the 
second logic circuit includes a two-input NAND gate 340, where the inputs to the 

15 NAND gate 340 are complements of the first and second bits y2, yi of the multiplier 
block. A T-gate 345 is coupled to the output ofthe NAND gate 340. The third 
multiplier bit ys and a complement of the third multiplier bit activate the T-gate 345. 
The output of the T-gate 345 provides the POS signal. A pull-down transistor 350 is 
coupled to the output T-gate 345. The third muhiplier bit ys activates the pull down 

20 transistor 350. 

A third logic circuit generates the Ml signal. In the embodiment 320, the 
third logic circuit includes a two-input XOR gate 355, where the inputs to the XOR 
gate 355 are first and second bits yi, y\ of a multiplier block and the output of the 
XOR gate 355 is the Ml signal. 

25 A fourth logic circuit generates the M2 signal. In the embodiment 320, the 

fourth logic includes a two-input XNOR gate 360, where the inputs to the XNOR 
gate 360 are first and second bits ya, yi of a multiplier block and the output of the 
XNOR gate 360 is M2 signal. 

FIG. 4 shows another embodiment 420 of a Booth encoder circuit. The 

30 embodiment 420 includes four logic circuits 425, 435, 355, 360 to encode the NEG, 
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POS, Ml and M2 signals and includes a fifth logic circuit 430 that detects when the 
three bits ya, y2, yi of the multiplier block are not ones or all zeros. The first logic 
circuit includes a two-input AND gate 425. One input to the AND gate 425 is the 
output of the fifth logic circuit 430 and a second input is the third bit ys of a 
5 multiplier block. The output of the AND gate 425 provides the NEG signal. The 
second logic circuit includes a two-input AND gate 435. One input to the AND gate 
435 is the output of the fifth logic circuit 430 and a second input is the complement 
of the third bit ya of a multiplier block. The output of the AND gate 435 provides 
the POS signal. The third and fourth logic circuits include the XOR gate 355 to 

10 provide the Ml signal and the XNOR gate 360 to provide the M2 signal as 
discussed previously in FIG. 3. 

FIG. 5 is an illustration of a partial product array circuit 500. The circuit 
500 includes a partial product circuit 100 for each partial product required by the 
multiply operation. Each partial product circuit 100 includes one Booth encoder 

15 circuit 120 to encode three-bits of a multiplier block and N muUiplexer circuits for 
each bit of the multiplicand 105. The number of partial product circuits 100 used in 
the array 500 is equal to one-half the number of bits in the multiplier 110. Thus, if 
the multiply operation requires multiplying an M-bit multiplier 1 10 by an N-bit 
multiplicand 105, the number of partial product circuits 100 is M/2. Thus the 

20 number of multiplexer circuits required is equal to M/2 times N. The letters M and 
N represent an integer number of bits usefiil in multiply circuits. Some examples 
include sixteen, thirty-two and sixty-four. 

A multiplying operation where M and N are relatively large integers shows 
the advantages of a multiplexer circuit that minimizes the number of pass gates in a 

25 multiplexer. For example, an integrated circuit implementation of a multiply 

operation that uses a sixty-four bit multiplier and a sixty-four multiplicand would 
require 32*64 or 2048 multiplexers. Use of the four-transistor multiplexer 215 of 
FIG. 2 over a multiplexer that uses five transistors would save 2048 transistors on 
the integrated circuit. 
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FIG. 6 is a block diagram of one embodiment of a multiply circuit 600 that 
uses a partial product array circuit 500. The partial product array circuit 500 
receives the bits of the muhiplicand and the multiplier and generates the partial 
products. An adder circuit 610 receives the partial products and them together along 
5 with generating any carries to form the final product. 

As the size of the multiplicand increases, the Booth control signals need to 
be distributed to more multiplexers. As the size of the multiplier increases, the 
multiplexer outputs may have to drive longer interconnect lines. FIG. 7 is an 
illustration of one embodiment 700 of a multiplexer circuit 715 with increased 

10 fanout capability. In the embodiment, the NEG and POS signals are inverted at the 
output of the Booth encoder circuit 120 using inverting buffers 750. The partial 
product output of the multiplexer circuit 715 is also inverted using inverting buffer 
760 thus preserving the logic sense of the partial product output. In another 
embodiment, the buffers are non-inverting. 

1 5 FIG. 8 is a block diagram of one embodiment of a multiply circuit 800 that 

uses a partial product circuit 100 with an accumulator 810. The partial products are 
calculated one at a time and the intermediate results are stored in the accumulator 
810. The partial product circuit 100 receives the bits of the multiplicand. A shift 
register 820 receives the bits of the multiplier and applies three bits of the multiplier 

20 at a time to the Booth encoder circuit 120. The partial product is then shifted if 
necessary and added to the accumulator 810 using the arithmetic logic unit (ALU) 
830. The result is stored back into the accumulator 810. The accumulator 810 
holds the final product after all of the partial products are calculated and added to 
the accumulator 810. 

25 FIG. 9 is an embodiment of a method 900 of multiplying multiplicand and a 

multiplier. At 910, four control signals are generated to implement Booth encoding 
fiinctions of negative, positive, zero, multiply by one and muUiply by two, firom bits 
of a multipUer. At 920, bits of a muhiplicand are multiplexed in accordance with 
the control signal fiinctions to generate partial products. The multiplexing includes 

30 interconnecting intermediate stages of muhiplexers fi-om lower order bit positions to 
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next higher order positions. At 930, the partial products are added to obtain the 
final product. 

FIG. 10 is a block diagram of one embodiment of a computer system 1000 
that uses a partial product circuit 100. The computer system includes a system bus 
5 1 0 1 0 for communicating information and a processor 1 0 1 5 for processing 

information. The processor 1015 includes a multiplier circuit 1020 that includes a 
partial product circuit 100 for calculating partial products of a multiply operation. A 
cache memory 1030 is coupled to the processor 1015 for storing information 
firequently used by the processor 1015, and a main memory 1035 is coupled to the 

10 processor 1015 including random access memory (RAM) for storing information 
and instructions, including multiply instructions, for the processor 1015. 

A read only memory (ROM) 1045 or other non-volatile storage device for 
storing fixed information for the processor 1015 is coupled to the system bus 1010. 
Other components such as a mass storage device 1040, a display device 1050, a 

15 keyboard 1055 or other input device and a cursor control device 1060 may also be 
included in the computer system 1000. 

Although specific examples have been illustrated and described herein, it 
will be appreciated by those of ordinary skill in the art that any arrangement 
calculated to achieve the same purpose could be substituted for the specific example 

20 shown. This application is intended to cover any adaptations or variations of the 
present invention. Therefore, it is intended that this invention be limited only by the 
claims and the equivalents shown. 
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